An integrated top-down/bottom-up approach to speaker diarization

نویسندگان

  • Simon Bozonnet
  • Nicholas W. D. Evans
  • Corinne Fredouille
  • Dong Wang
  • Raphaël Troncy
چکیده

Most speaker diarization systems fit into one of two categories: bottom-up or top-down. Bottom-up systems are the most popular but can sometimes suffer from instability from merging and stopping criteria difficulties. Top-down systems deliver competitive results but are particularly prone to poor model initialization which often leads to large variations in performance. This paper presents a new integrated bottom-up/topdown approach to speaker diarization which aims to harness the strengths of each system and thus to improve performance and stability. In contrast to previous work, here the two systems are fused at the heart of the segmentation and clustering stage. Experimental results show improvements in speaker diarization performance for both meeting and TV-show domain data indicating increased intra and inter-domain stability. On the TVshow data in particular, an average relative improvement of 32% DER is obtained.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The LIA-EURECOM RT‘09 Speaker Diarization System

This paper presents LIA-EURECOM’s joint submission to the NIST Rich Transcription 2009 (RT‘09) speaker diarization evaluation. We describe a number of modifications to our previous system which involve beamforming for the multiple distant microphone (MDM) condition and also significant enhancements to the speaker segmentation stage of the core speaker diarization system. These modifications lea...

متن کامل

Convolutive Non-Negative Sparse Coding and New Features for Speech Overlap Handling in Speaker Diarization

The effective handling of overlapping speech is at the limits of the current state of the art in speaker diarization. This paper presents our latest work in overlap detection. We report the combination of features derived through convolutive nonnegative sparse coding and new energy, spectral and voicingrelated features within a conventional HMM system. Overlap detection results are fully integr...

متن کامل

Prosodic and Phonetic Features for Speaker Clustering in Speaker Diarization Systems

This work is focused on speaker clustering methods that are used in speaker diarization systems. The purpose of speaker clustering is to associate together segments that belong to the same speaker and is usually applied in the last stage of the speaker-diarization process. We concentrate on developing proper representations of speaker segments for clustering. We realize two different speaker cl...

متن کامل

System output combination for improved speaker diarization

System combination or fusion is a popular, successful and sometimes straightforward means of improving performance in many fields of statistical pattern classification, including speech and speaker recognition. Whilst there is significant work in the literature which aims to improve speaker diarization performance by combining multiple feature streams, there is little work which aims to combine...

متن کامل

A Comparative Study of Effect of Bottom-up and Top-down Instructional Approaches on EFL Learners’ Vocabulary Recall and Retention

This quasi-experimental study investigated the effect of bottom-up and top-down instructional approaches on English as a foreign language (EFL) vocabulary recall and retention. To this end, 44 high school students from two intact classes were assigned to bottom-up (n = 21) and top-down (n = 23) groups. The participants were exposed to 20 hours of explicit vocabulary instruction during 10 weeks ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010